15 research outputs found

    VennDiagramWeb: a web application for the generation of highly customizable Venn and Euler diagrams.

    Get PDF
    BackgroundVisualization of data generated by high-throughput, high-dimensionality experiments is rapidly becoming a rate-limiting step in computational biology. There is an ongoing need to quickly develop high-quality visualizations that can be easily customized or incorporated into automated pipelines. This often requires an interface for manual plot modification, rapid cycles of tweaking visualization parameters, and the generation of graphics code. To facilitate this process for the generation of highly-customizable, high-resolution Venn and Euler diagrams, we introduce VennDiagramWeb: a web application for the widely used VennDiagram R package. VennDiagramWeb is hosted at http://venndiagram.res.oicr.on.ca/ .ResultsVennDiagramWeb allows real-time modification of Venn and Euler diagrams, with parameter setting through a web interface and immediate visualization of results. It allows customization of essentially all aspects of figures, but also supports integration into computational pipelines via download of R code. Users can upload data and download figures in a range of formats, and there is exhaustive support documentation.ConclusionsVennDiagramWeb allows the easy creation of Venn and Euler diagrams for computational biologists, and indeed many other fields. Its ability to support real-time graphics changes that are linked to downloadable code that can be integrated into automated pipelines will greatly facilitate the improved visualization of complex datasets. For application support please contact [email protected]

    BAMQL: a query language for extracting reads from BAM files

    No full text
    Abstract Background It is extremely common to need to select a subset of reads from a BAM file based on their specific properties. Typically, a user unpacks the BAM file to a text stream using SAMtools, parses and filters the lines using AWK, then repacks them using SAMtools. This process is tedious and error-prone. In particular, when working with many columns of data, mix-ups are common and the bit field containing the flags is unintuitive. There are several libraries for reading BAM files, such as Bio-SamTools for Perl and pysam for Python. Both allow access to the BAM’s read information and can filter reads, but require substantial boilerplate code; this is high overhead for mostly ad hoc filtering. Results We have created a query language that gathers reads using a collection of predicates and common logical connectives. Queries run faster than equivalents and can be compiled to native code for embedding in larger programs. Conclusions BAMQL provides a user-friendly, powerful and performant way to extract subsets of BAM files for ad hoc analyses or integration into applications. The query language provides a collection of predicates beyond those in SAMtools, and more flexible connectives

    VennDiagramWeb: a web application for the generation of highly customizable Venn and Euler diagrams

    No full text
    Abstract Background Visualization of data generated by high-throughput, high-dimensionality experiments is rapidly becoming a rate-limiting step in computational biology. There is an ongoing need to quickly develop high-quality visualizations that can be easily customized or incorporated into automated pipelines. This often requires an interface for manual plot modification, rapid cycles of tweaking visualization parameters, and the generation of graphics code. To facilitate this process for the generation of highly-customizable, high-resolution Venn and Euler diagrams, we introduce VennDiagramWeb: a web application for the widely used VennDiagram R package. VennDiagramWeb is hosted at http://venndiagram.res.oicr.on.ca/ . Results VennDiagramWeb allows real-time modification of Venn and Euler diagrams, with parameter setting through a web interface and immediate visualization of results. It allows customization of essentially all aspects of figures, but also supports integration into computational pipelines via download of R code. Users can upload data and download figures in a range of formats, and there is exhaustive support documentation. Conclusions VennDiagramWeb allows the easy creation of Venn and Euler diagrams for computational biologists, and indeed many other fields. Its ability to support real-time graphics changes that are linked to downloadable code that can be integrated into automated pipelines will greatly facilitate the improved visualization of complex datasets. For application support please contact [email protected]

    BAMQL:A query language for extracting reads from BAM files

    Get PDF
    BACKGROUND: It is extremely common to need to select a subset of reads from a BAM file based on their specific properties. Typically, a user unpacks the BAM file to a text stream using SAMtools, parses and filters the lines using AWK, then repacks them using SAMtools. This process is tedious and error-prone. In particular, when working with many columns of data, mix-ups are common and the bit field containing the flags is unintuitive. There are several libraries for reading BAM files, such as Bio-SamTools for Perl and pysam for Python. Both allow access to the BAM’s read information and can filter reads, but require substantial boilerplate code; this is high overhead for mostly ad hoc filtering. RESULTS: We have created a query language that gathers reads using a collection of predicates and common logical connectives. Queries run faster than equivalents and can be compiled to native code for embedding in larger programs. CONCLUSIONS: BAMQL provides a user-friendly, powerful and performant way to extract subsets of BAM files for ad hoc analyses or integration into applications. The query language provides a collection of predicates beyond those in SAMtools, and more flexible connectives. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1162-y) contains supplementary material, which is available to authorized users

    A community effort to create standards for evaluating tumor subclonal reconstruction

    No full text
    Tumor DNA sequencing data can be interpreted by computational methods that analyze genomic heterogeneity to infer evolutionary dynamics. A growing number of studies have used these approaches to link cancer evolution with clinical progression and response to therapy. Although the inference of tumor phylogenies is rapidly becoming standard practice in cancer genome analyses, standards for evaluating them are lacking. To address this need, we systematically assess methods for reconstructing tumor subclonality. First, we elucidate the main algorithmic problems in subclonal reconstruction and develop quantitative metrics for evaluating them. Then we simulate realistic tumor genomes that harbor all known clonal and subclonal mutation types and processes. Finally, we benchmark 580 tumor reconstructions, varying tumor read depth, tumor type and somatic variant detection. Our analysis provides a baseline for the establishment of gold-standard methods to analyze tumor heterogeneity

    Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

    Get PDF
    The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that -80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAFPeer reviewe

    A community effort to create standards for evaluating tumor subclonal reconstruction

    No full text
    Methods for reconstructing tumor evolution are benchmarked in the DREAM Somatic Mutation Calling Tumour Heterogeneity Challenge. Tumor DNA sequencing data can be interpreted by computational methods that analyze genomic heterogeneity to infer evolutionary dynamics. A growing number of studies have used these approaches to link cancer evolution with clinical progression and response to therapy. Although the inference of tumor phylogenies is rapidly becoming standard practice in cancer genome analyses, standards for evaluating them are lacking. To address this need, we systematically assess methods for reconstructing tumor subclonality. First, we elucidate the main algorithmic problems in subclonal reconstruction and develop quantitative metrics for evaluating them. Then we simulate realistic tumor genomes that harbor all known clonal and subclonal mutation types and processes. Finally, we benchmark 580 tumor reconstructions, varying tumor read depth, tumor type and somatic variant detection. Our analysis provides a baseline for the establishment of gold-standard methods to analyze tumor heterogeneity
    corecore